Belief Propagation on Uncertain Schema Mappings in Peer Data Management Systems

Authors

  • Philippe Cudré-Mauroux
  • Karl Aberer
Abstract

Until recently, most data integration techniques involved central components, e.g., global schemas, to enable transparent access to heterogeneous databases. Today, however, with the democratization of tools facilitating knowledge elicitation in machine-processable formats, one can no longer rely on global, centralized schemas, as knowledge creation and consumption are becoming increasingly dynamic and decentralized. Peer Data Management Systems (PDMS) address this problem by eliminating the central semantic component and instead composing local, pair-wise mappings to propagate queries from one database to the others. In the following, we give an overview of various PDMS approaches; all the approaches proposed so far implicitly assume that every schema mapping used to reformulate a query is correct. This cannot be taken for granted in typical PDMS settings, where mappings can be created (semi-)automatically by independent parties. We therefore propose a totally decentralized, efficient message passing scheme to automatically detect erroneous schema mappings in a PDMS. Our scheme is based on a probabilistic model in which we take advantage of transitive closures of mapping operations to confront local belief in the correctness of a mapping with evidence gathered around the network. We show that our scheme can be efficiently embedded in any PDMS and evaluate our techniques on large sets of automatically generated schemas.
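
To make the idea in the abstract more concrete, the following is a minimal sketch of sum-product message passing over mapping cycles. It is not the authors' implementation. The feedback model in `cycle_factor`, the parameter `delta`, the function names, and the centralized in-memory loop (a real PDMS would exchange these messages between peers) are all assumptions made for illustration. A cycle is a closed chain of at least two mappings whose round trip either preserves a query (positive feedback) or does not (negative feedback).

```python
from itertools import product


def cycle_factor(num_errors, positive, delta=0.1):
    """Assumed likelihood of a cycle's feedback given how many of its mappings are wrong."""
    if num_errors == 0:
        p_positive = 1.0      # assumption: all mappings correct, round trip is intact
    elif num_errors == 1:
        p_positive = 0.0      # assumption: a single error is always visible
    else:
        p_positive = delta    # assumption: several errors cancel out with probability delta
    return p_positive if positive else 1.0 - p_positive


def normalize(pair):
    total = pair[0] + pair[1]
    return [pair[0] / total, pair[1] / total]


def propagate(priors, cycles, rounds=10, delta=0.1):
    """Sum-product message passing on the mapping/cycle factor graph.

    priors: {mapping: prior probability that the mapping is correct}
    cycles: list of (tuple of mappings, True if the feedback was positive)
    Returns {mapping: posterior probability that the mapping is correct}.
    Every message is a pair [weight_incorrect, weight_correct].
    """
    edges = [(c, m) for c, (members, _) in enumerate(cycles) for m in members]
    f2v = {edge: [0.5, 0.5] for edge in edges}  # cycle-to-mapping messages
    for _ in range(rounds):
        # Mapping-to-cycle: prior times the messages from all other cycles.
        v2f = {}
        for (c, m) in edges:
            w = [1.0 - priors[m], priors[m]]
            for (c2, m2) in edges:
                if m2 == m and c2 != c:
                    w = [w[0] * f2v[(c2, m2)][0], w[1] * f2v[(c2, m2)][1]]
            v2f[(c, m)] = normalize(w)
        # Cycle-to-mapping: sum over the states of the other mappings in the cycle.
        for c, (members, positive) in enumerate(cycles):
            for m in members:
                others = [x for x in members if x != m]
                out = [0.0, 0.0]
                for state in (0, 1):
                    for assign in product((0, 1), repeat=len(others)):
                        weight = 1.0
                        for o, a in zip(others, assign):
                            weight *= v2f[(c, o)][a]
                        errors = (1 - state) + assign.count(0)
                        out[state] += weight * cycle_factor(errors, positive, delta)
                f2v[(c, m)] = normalize(out)
    # Belief of each mapping: prior times all incoming cycle messages.
    posterior = {}
    for m, prior in priors.items():
        w = [1.0 - prior, prior]
        for (c, m2) in edges:
            if m2 == m:
                w = [w[0] * f2v[(c, m2)][0], w[1] * f2v[(c, m2)][1]]
        posterior[m] = normalize(w)[1]
    return posterior


if __name__ == "__main__":
    # Hypothetical network: one failing three-mapping cycle, two passing two-mapping cycles.
    priors = {name: 0.7 for name in ("m1", "m2", "m3", "m4", "m5")}
    cycles = [(("m1", "m2", "m3"), False), (("m1", "m4"), True), (("m2", "m5"), True)]
    for name, belief in sorted(propagate(priors, cycles).items()):
        print(name, round(belief, 3))
```

In this hypothetical network, m1 and m2 are each vouched for by a passing cycle, so the failing cycle shifts suspicion onto m3. When the factor graph has no loops, as here, the computed beliefs are exact marginals; with overlapping cycles the same updates give only an approximation.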


Similar resources

Self-Organizing Schema Mappings in the GridVine Peer Data Management System

GridVine is a Peer Data Management System based on a decentralized access structure. Built following the principle of data independence, it separates a logical layer – where data, schemas and mappings are managed – from a physical layer consisting of a structured Peer-to-Peer network supporting efficient routing of messages and index load-balancing. Our system is totally decentralized, yet it f...


Self-Organizing Schema Mappings in the GridVine Peer Data Management System [Demonstration]

GridVine is a Peer Data Management System based on a decentralized access structure. Built following the principle of data independence, it separates a logical layer – where data, schemas and mappings are managed – from a physical layer consisting of a structured Peer-to-Peer network supporting efficient routing of messages and index load-balancing. Our system is totally decentralized, yet it f...


Emergent Semantics: Rethinking Interoperability for Large Scale Decentralized Information Systems

In the past, the problem of semantic interoperability in information systems was mostly solved by means of centralization, both at a system and at a logical level. This approach has been successful to a certain extent, but offers limited scalability and flexibility. Peer-to-Peer systems as a new brand of system architectures indicate that the principles of decentralization and self-organization...


Peer Data Management System

A Peer Data Management System (PDMS) is a distributed data integration system providing transparent access to heterogeneous databases without resorting to a centralized logical schema. Instead of imposing a uniform query interface over a mediated schema, PDMSs let the peers define their own schemas and allow for the reformulation of queries through mappings relating pairs of schemas (...


International Workshop on Emergent Semantics and Ontology Evolution

Until recently, most data interoperability techniques involved central components, e.g., global schemas or ontologies, to overcome semantic heterogeneity for enabling transparent access to heterogeneous data sources. Today, however, with the democratization of tools facilitating knowledge elicitation in machine-processable formats, one cannot rely on global, centralized schemas anymore as knowle...




Publication date: 2006